Universal Coronavirus Vaccine

Cross reference to related applications

This application is a continuation-in-part
application of U.S. application serial number 07/882,171,
filed May 8, 1992, pending, which is a continuation-in-part
of U.S. application serial number 07/698,927, filed May 13,
1991, which is a continuation-in-part of U.S. application
serial number 07/613,066, filed November 14, 1990, each of
which is incorporated herein by reference.

Field of the invention

The present invention relates to a universal vaccine 
useful to protect different species of animals against 
infection by different host-specific coronaviruses. 

Background of the invention 

Coronaviruses are a family of host-specific
enveloped RNA viruses with a single-stranded positive sense
genome. Examples of coronaviruses include, but are not
limited to: feline infectious peritonitis virus (FIPV) and
feline enteric coronavirus (FECV), which are specific to felines;
canine coronavirus (CCV), which is specific to canines;
transmissible gastroenteritis coronavirus (TGEV), which is
specific to swine; bovine coronavirus (BCV), which is specific
to bovine species; human coronavirus, which is specific to
humans; mouse hepatitis virus (MHV), which is specific to
murine species; and infectious bronchitis virus (IBV), which
is specific to avian species. These host-specific
coronaviruses cannot cross-infect different species of
animals. Viral infection of the host by a coronavirus can
cause symptoms ranging from mild enteritis to severe
debilitating disease to, in some cases, death.

Coronaviruses share common structural features
including a spike or S protein (also referred to as a peplomer
protein). The S protein is a glycoprotein which protrudes
from the surface of the virus particle. The S protein
mediates the binding of virions to the host cell receptor and
is involved in membrane fusion. In addition, it is the target
of virus neutralizing antibodies.

S proteins contain an N-terminal signal sequence,
a C-terminal transmembrane segment and potential N-linked
glycosylation sites. Comparison of different coronavirus S
proteins shows little homology, i.e. similarity, at the N
terminus and highly conserved amino acid sequences at the C
terminus. Because the tissue tropism and disease
symptomatology are quite varied among this virus family, it is
speculated that the pathogenesis of coronaviruses is
determined by the sequences encoded at the N-terminus while
the more conserved C-terminus encodes critical structural
features common to all coronaviruses. The carboxy terminus
of the S protein is believed to be involved in fusion.
The structure of the S protein has been studied.
Cavanagh (1983) J. Gen. Virol. 64:2577-2583, which is
incorporated herein by reference, proposed a model for the
coronavirus spike in which the C-terminal half of the protein
forms its stalk and the N-terminal half, its bulbous portion.
deGroot et al. (1987) J. Mol. Biol. 197:, which is
incorporated herein by reference, have postulated a model in
which a coiled-coil structure forms the connection between the
globular part of the S protein and the viral membrane. This
model is based on the occurrence of heptad repeats, i.e., a
periodicity (a-b-c-d-e-f-g) in which the amino acids are
hydrophobic. Britton (1991) Nature 353:394, which is
incorporated herein by reference, reported the presence of a
leucine zipper motif at the carboxyl end of the S glycoprotein
of coronaviruses for which the spike sequence is available:
TGEV FS772/70 (amino acids 1342-1377), FIPV WSU 1146 (amino
acids 1345-1380), MHV A59 (amino acids 1217-1252), human
coronavirus 229E (amino acids 1067-1102), BCV Mebus (amino
acids 1266-1294), and infectious bronchitis virus Beaudette
(amino acids 1059-1079). The leucine zipper motif terminates
ten residues upstream of the conserved KWP motif preceding the
transmembrane domain.

Efforts have been made to develop vaccines against
various host-specific coronaviruses. Attempts have been made
with varying success to develop attenuated live virus
vaccines, inactivated vaccines, subunit vaccines and
recombinant nucleic acid based vaccines. In each case, the
vaccine developed did not cross-protect other host animals.
Vaccines currently available for protection against
coronavirus are specific for protection against a given member
of the coronavirus family. Such vaccines do not provide
cross-protection to protect a host against other members of the
coronavirus family which are able to infect the species.
Furthermore, such vaccines do not cross-protect other animals
against coronaviruses to which they are susceptible.

There is a need for a vaccine which can protect
against coronavirus infection. In particular, there is a need
for a vaccine which can be useful to protect a host species
against different coronaviruses and there is a need for a
vaccine which can be useful to protect different host species
against different coronaviruses.

Summary of the invention 

The present invention relates to a polypeptide
comprising an amino acid sequence from the C terminal portion
of a coronavirus S protein which has been found to be highly
conserved among coronaviruses and which is capable of
eliciting a protective immune response. This sequence is
referred to as a universal conserved domain. The polypeptides
of the present invention have less than a complete amino acid
sequence of an S protein.

The present invention relates to a vaccine
comprising a polypeptide which includes a universal conserved
domain and which has less than a complete amino acid sequence
of an S protein.

The present invention relates to an isolated nucleic
acid molecule having a nucleic acid sequence which encodes a
polypeptide that includes a universal conserved domain
polypeptide and that has less than a complete amino acid
sequence of an S protein.

The present invention relates to a vaccine
comprising a nucleic acid molecule that encodes a polypeptide
which includes a universal conserved domain and which has
less than a complete amino acid sequence of an S protein.

The present invention relates to a method of
protecting an animal from infection by a coronavirus
comprising administering an amount of a polypeptide effective
to elicit a protective immune response. The polypeptide
administered in the method comprises a universal conserved
domain and has less than a complete amino acid sequence of an
S protein.

The present invention relates to a method of
protecting an animal from infection by a coronavirus
comprising administering an amount of a nucleic acid molecule
which encodes a polypeptide effective to elicit a protective
immune response. The polypeptide encoded by the nucleic acid
molecule administered in the method comprises a universal
conserved domain and has less than a complete amino acid
sequence of an S protein.

Detailed description of the invention

According to the present invention, a highly
conserved region of the spike protein has been identified
which, when presented as a vaccine component or product, is
useful as a universal immunogen to protect an animal against
coronavirus infection. The vaccine of the present invention
may be used to vaccinate any animal susceptible to infection
by a virus that is a member of the coronavirus family.
Accordingly, the present invention provides vaccines which can
be produced in a single manufacturing process and administered
to different species of animals. The cross-protection
afforded by vaccines of the present invention eliminates the
need to produce different vaccines to protect animals against
different members of the coronavirus family.

As used herein, the term "polypeptide" is meant to
refer to a peptide, polypeptide or protein molecule; a
molecule which includes a peptide, polypeptide or protein
molecule; or a molecule that contains amino acid residues
which are linked by non-peptide bonds.

As used herein, the term "universal conserved
domain" ("UCD") is meant to refer to the identical 124 amino
acid segment found in the C terminal portion of S proteins
from TGEV, CCV and strains of feline coronaviruses. In
addition, the term "UCD" is meant to refer to the
corresponding amino acid segments of other coronaviruses which
have different but homologous amino acid sequences. Such
corresponding sequences may be identified by their location
in the S protein, i.e. downstream of the bulbous N-terminal
region and upstream of the transmembrane region, and by their
high level of amino acid sequence similarity to the 124 amino
acid sequence described above. Furthermore, the term "UCD" is
additionally meant to refer to consensus sequences that are
generated by comparing corresponding sequences and determining
the statistically average amino acid residue at a given
position in the sequence. Thus, when several different
sequences are compared, the most common residue at a given
position is assigned to that position in a consensus sequence.
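The consensus rule just described (the most common residue at each position) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `consensus` and the three short aligned sequences are hypothetical and are not taken from Table 1, and the sketch assumes the sequences have already been aligned to equal length.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most common residue at each column of pre-aligned sequences.

    Assumes every sequence has the same length (already aligned).
    Ties are resolved in favor of the residue seen first in the column.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to equal length")
    residues = []
    for column in zip(*aligned_sequences):
        # Counter.most_common(1) yields the (residue, count) pair with the top count.
        residues.append(Counter(column).most_common(1)[0][0])
    return "".join(residues)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["KWPWYV", "KWPWYI", "KWPWWV"]
toy_consensus = consensus(toy_alignment)
```

With real data, the input would be the aligned corresponding regions of the S proteins being compared.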

The conservation of UCD sequences suggests that they
play a major role in virus structure and/or replication. The
region of perfect homology decreases in size as other
coronavirus S genes are included in the comparison. For
example, bovine and human coronavirus are more closely aligned
to the feline, canine and porcine coronavirus S genes in this
conserved region than are sequences from the murine and avian
coronaviruses.

Table 1 contains a comparison of corresponding amino
acid sequences from the C terminal portion of various
coronaviruses. SEQ ID NO:1 is an amino acid sequence from
FIPV strain Wsue2 (Virulent, Type II; Genbank accession number



WO 93/23421 PCT/US93/04365 


X06170). SEQ ID NO:2 is an amino acid sequence from FIPV
strain Df2e2 (Virulent, Type II). SEQ ID NO:3 is an amino
acid sequence from FIPV strain Tse2 (Temperature sensitive
mutant of Df2). SEQ ID NO:4 is an amino acid sequence from
FECV strain Fecve2 (Avirulent strain 1683). SEQ ID NO:5 is
an amino acid sequence from TGEV strain Tgeve2 (Purdue strain;
Genbank accession number D0C118). SEQ ID NO:6 is an amino
acid sequence from TGEV strain Tgeve2f2 (Miller strain;
Genbank accession number M56002). SEQ ID NO:7 is an amino
acid sequence from BCV strain Bcve2 (Genbank accession number
M30613). SEQ ID NO:8 is an amino acid sequence from HCV
strain Hcve2 (Genbank accession number X16816). SEQ ID NO:9
is an amino acid sequence from IBV strain Ibbspi (Genbank
accession number X16816). SEQ ID NO:10 is an amino acid
sequence from MHV strain Mhve2a59 (Genbank accession number
X51939). SEQ ID NO:11 is an amino acid sequence from MHV strain
Mhvs (Genbank accession number X04797). SEQ ID NO:12 is a
consensus sequence which has been designed to provide an
optimum UCD amino acid sequence.

The 124 residue amino acid sequence which is
completely conserved in TGEV, CCV and feline coronaviruses is
shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4
and SEQ ID NO:5 from residue 37 to residue 160. The consensus
sequence, SEQ ID NO:12, also contains this 124 amino acid
sequence in its entirety from residue 37 to residue 160. This
124 amino acid sequence is currently a preferred UCD sequence
of the present invention. The entire 199 amino acid consensus
sequence is a preferred UCD-containing peptide.

Using amino acid sequence information from any
coronavirus, one having ordinary skill in the art can identify
the conserved region corresponding to the 124 amino acid
sequence found in TGEV, CCV and feline coronaviruses. As
exemplified in Table 1, the amino acid sequences from the C
terminal portion of coronaviruses can be compared to identify
the sequence which corresponds to the UCD from TGEV, CCV and
feline coronaviruses. The procedure is straightforward and
can be performed to provide additional UCD sequences and
flanking sequences.

Corresponding conserved regions from coronaviruses
other than CCV, TGEV and feline coronaviruses may be
identified by their location on the S protein and the high
level of sequence homology they possess when compared to the
124 amino acid sequence referred to above. An example of such
comparison and identification is shown in Table 1 in which
sequences from the C terminal regions of various S proteins
upstream from the transmembrane region are compared and
homologous sequences identified. Widely available computer
programs such as PLOTSIMILARITY software (Genetics Computer
Group, Madison WI) may be employed to locate a UCD in a
coronavirus.
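As a rough illustration of how a conserved stretch can be located by comparison, the following Python sketch scores fixed-length windows of two pre-aligned sequences by fraction of identical residues and reports the best-scoring window. It is a simplified stand-in and is not the algorithm used by the PLOTSIMILARITY program; the function names and the two short sequences are hypothetical.

```python
def window_identity(seq_a, seq_b, window):
    """Fraction of identical residues in each window of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if window < 1 or window > len(seq_a):
        raise ValueError("window must fit inside the alignment")
    matches = [a == b for a, b in zip(seq_a, seq_b)]
    return [
        sum(matches[i:i + window]) / window
        for i in range(len(matches) - window + 1)
    ]


def best_window(seq_a, seq_b, window):
    """Return (0-based start, score) of the highest-identity window.

    The first window is returned when several windows tie.
    """
    scores = window_identity(seq_a, seq_b, window)
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Hypothetical aligned toy sequences; only positions 4-9 are identical.
toy_a = "MTSLQGQALSKWPW"
toy_b = "AGRPQGQALSHCDE"
start, score = best_window(toy_a, toy_b, window=6)
```

A practical comparison would use a substitution-scoring scheme and a proper alignment step rather than simple identity on pre-aligned strings.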

In addition, such software may be employed to
expedite the generation of consensus sequences. This software
relies on the principles originally set out by Wilbur and
Lipman and later refined by Smith and Waterman and by
Needleman and Wunsch. Using these well known guidelines, one
having ordinary skill in the art may compare sequences and
arrive at the statistically average or most common residue
occupying a given position. The PLOTSIMILARITY software
automates this function. Consensus sequences are thus
generated. In addition to the consensus sequence provided as
SEQ ID NO:12, a different consensus sequence derived from a
comparison of corresponding sequences is disclosed in the
co-owned, co-pending patent application: which is filed on the
same day as the present application; which is entitled
"Compositions and Methods for Vaccinating Coronaviruses";
which names the same inventors as the present application
(Miller, Timothy J.; Jones, Elaine V.; Reed, Albert P.; and
Klepfer, Sharon R); which has been designated docket number
H85009-1 by Applicants; and which is incorporated herein by
reference.

Accordingly, the present invention relates to
polypeptides which comprise a UCD or a fragment or a
derivative thereof. That is, the present invention relates
to polypeptides which comprise: the 124 amino acid sequence
from TGEV, CCV and feline coronaviruses; or the different
amino acid sequences from other coronaviruses which correspond
to the 124 amino acid sequence; or a consensus sequence
generated from comparison of corresponding regions; or
immunogenic fragments or immunogenic derivatives thereof.

Polypeptides according to the present invention may further
comprise additional flanking sequences from coronavirus or
flanking sequences designed as a consensus sequence of the
flanking sequences of corresponding regions from different
coronaviruses.

As used herein, the term "immunogenic fragment" is
meant to refer to polypeptides which include an incomplete UCD
which is capable of eliciting a protective immune response
against coronavirus in an animal susceptible to coronavirus
infection. Immunogenic fragments may comprise a sequence
having nine or more amino acids from a UCD, and may include
additional amino acid sequences.
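To give a sense of the candidate space implied by the nine-residue minimum, the sketch below enumerates every contiguous fragment of a chosen length from a sequence. A 124 residue UCD yields 116 overlapping nine-residue candidates. The function name and the placeholder sequence are hypothetical; the placeholder is not a real UCD, and whether any given fragment is immunogenic can only be established experimentally.

```python
def contiguous_fragments(sequence, length=9):
    """List every contiguous fragment of the given length, in order."""
    if length < 1:
        raise ValueError("length must be positive")
    return [
        sequence[i:i + length]
        for i in range(len(sequence) - length + 1)
    ]


# Placeholder 124-residue string standing in for a UCD (not a real sequence).
alphabet = "ACDEFGHIKLMNPQRSTVWY"
placeholder_ucd = "".join(alphabet[i % len(alphabet)] for i in range(124))

nine_mers = contiguous_fragments(placeholder_ucd, length=9)
candidate_count = len(nine_mers)  # 124 - 9 + 1 candidates
```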

As used herein, the term "immunogenic derivatives"
is meant to refer to molecules which have a UCD or portions
thereof with conservative amino acid substitutions and which
are capable of eliciting a protective immune response against
a coronavirus in an animal susceptible to coronavirus
infection. Those having ordinary skill in the art can readily
design derivatives having UCD sequences with conservative
substitutions for amino acids. For example, following what
are referred to as Dayhoff's rules for amino acid substitution
(Dayhoff, M.O. (1978) Nat. Biomed. Res. Found., Washington,
D.C. Vol. 5, supp. 3), amino acid residues in a peptide
sequence may be substituted with comparable amino acid
residues. Such substitutions are well known and are based
upon the charge and structural characteristics of each amino acid.
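The idea of substituting comparable residues can be illustrated with a simple lookup by physicochemical class. The grouping below is a common textbook simplification chosen for this sketch; it is an assumption and does not reproduce the Dayhoff substitution data cited above. The names `RESIDUE_CLASSES` and `is_conservative` are hypothetical.

```python
# Assumed, simplified residue classes (one-letter codes); illustrative only.
RESIDUE_CLASSES = {
    "aliphatic": set("AVLIM"),
    "aromatic": set("FWY"),
    "polar_uncharged": set("STNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def residue_class(residue):
    """Return the class name for a residue, or None if it is ungrouped.

    C, G and P are deliberately left ungrouped in this sketch.
    """
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    return None


def is_conservative(original, replacement):
    """True when both residues fall in the same assumed class."""
    cls = residue_class(original)
    return cls is not None and cls == residue_class(replacement)
```

For example, under this grouping a leucine to isoleucine change counts as conservative, while a lysine to aspartate change does not.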

Using standard procedures and readily available
starting materials, one having ordinary skill in the art can
determine whether a fragment or derivative is an immunogenic
fragment or an immunogenic derivative, respectively. Briefly,
polypeptides can be produced by standard methodologies and
tested to determine whether they are capable of eliciting a
protective immune response. Sera from vaccinated animals can
be analyzed to detect the presence of antibodies capable of
inhibiting infection of cells in culture. Furthermore,
challenge studies can be performed to determine if animals
vaccinated with a polypeptide are protected from subsequent
infection by wild type virus. One having ordinary skill in
the art can routinely produce and screen fragments and
derivatives to determine the effectiveness of such vaccine
components to elicit protective immune responses. Similarly,
larger molecules may also be screened by the same means to
detect their ability to elicit a protective immune response.

The UCD lies near the transmembrane region of the
S protein. Because this region of the S protein is purported
to be involved in the secondary structure of the glycoprotein,
in receptor binding and in virus-induced cell fusion, the UCD
plays an important role in the function of the S protein and
in the formation of infectious virus. Inducing an immune
response against this region will interfere with the folding
of the S glycoprotein into its proper conformation.
Circulating antibodies to this region could bind
to either virus or infected cells expressing the glycoprotein
on the surface. Virus complexed with antibody may be unable
to bind to receptors on susceptible cells and/or initiate the
pathway required to gain entry, which involves a conformational
change of the S protein. Recognition of this region on the
surface of infected cells would target them for clearance.
Antibody binding to the conserved region of the S protein
surface expressed by infected cells would, most likely,
prevent cell fusion and interfere with virus assembly.
Regardless of mechanism, an immune response to the UCD of a
coronavirus S protein will inhibit virus spread from cell to
cell and limit virus infection.

Polypeptides according to the present invention
comprise less than a complete S protein sequence. In
particular, the polypeptides do not comprise a complete
N-terminal portion of an S protein and preferably comprise few
or no amino acid sequences from the N-terminal bulbous portion
of the protein. Furthermore, the polypeptides preferably do
not comprise a complete transmembrane domain of an S protein.
In some preferred embodiments, polypeptides comprise no more
than a 400 amino acid sequence upstream (from the C terminus
to the N terminus) from about 2 amino acids upstream from the
transmembrane domain. In some preferred embodiments,
polypeptides comprise no more than a 300 amino acid sequence
upstream (from the C terminus to the N terminus) from about
5 amino acids upstream from the transmembrane domain.
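The residue limits described above can be expressed as a small coordinate calculation. The sketch below is one possible reading, stated as an assumption: the region ends a fixed number of residues before the first residue of the transmembrane domain and extends no more than a maximum length toward the N terminus. The function name and the transmembrane start position of 1400 are hypothetical.

```python
def upstream_region(tm_start, gap=5, max_length=300):
    """Return 1-based inclusive (start, end) of the region upstream of the
    transmembrane domain.

    Assumed reading: the region ends `gap` residues before residue
    `tm_start` and spans at most `max_length` residues toward the N terminus.
    """
    end = tm_start - gap
    if end < 1:
        raise ValueError("gap places the region before the first residue")
    start = max(1, end - max_length + 1)
    return start, end


# Hypothetical transmembrane start position, for illustration only.
start_300, end_300 = upstream_region(1400)                         # 300-residue limit
start_400, end_400 = upstream_region(1400, gap=2, max_length=400)  # 400-residue limit
```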

In some preferred embodiments, polypeptides which
comprise a UCD, or derivatives and/or fragments thereof,
further comprise flanking sequences of the UCD found in
coronavirus. For example, in some preferred embodiments, the
polypeptide comprises portions of the S protein flanked by and
optionally including the heptad repeats reported by deGroot
et al., such as, for example, in FIPV strain WSU 1146 from
residues 1067 to 1380. In some preferred embodiments, the
polypeptide comprises portions of the S protein flanked on the
carboxy side by, and optionally including, a leucine zipper motif
as reported by Britton. In some preferred embodiments, the
polypeptide comprises portions of the S protein from about 300
residues upstream of the transmembrane region to about 5 amino
acid residues upstream from the transmembrane domain.

In some preferred embodiments, the polypeptide
comprises a UCD about 124 amino acids in length. In some
preferred embodiments, the polypeptide comprises an
immunogenic fragment of a UCD about 100 amino acids in length.
In some preferred embodiments, the polypeptide comprises an
immunogenic fragment of a UCD about 50 amino acids in length.
In some preferred embodiments, the polypeptide comprises an
immunogenic fragment of a UCD about 25 amino acids in length.
In some preferred embodiments, the polypeptide comprises an
immunogenic fragment of a UCD about 15 amino acids in length.
In some preferred embodiments, the polypeptide comprises an
immunogenic fragment of a UCD about 10 amino acids in length.

In some preferred embodiments, a UCD comprises amino
acid residues 37-160 of SEQ ID NO:12. Additional preferred
embodiments comprise SEQ ID NO:12. Other preferred
embodiments of the invention comprise SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:5. Other
preferred embodiments comprise SEQ ID NO:6, SEQ ID NO:7, SEQ
ID NO:8, SEQ ID NO:9, SEQ ID NO:10 or SEQ ID NO:11.

In addition to a UCD and, optionally, additional
flanking segments from an S protein, other peptide segments
may also be included in the polypeptide of the present
invention. Such additional peptide segments may comprise
other immunogenic targets from coronavirus and/or other
pathogens, and/or they may be provided for improved stability,
UCD epitope presentation or production/purification
facilitation. The resulting polypeptide is considered a
chimeric or fusion polypeptide.

Vaccines according to the present invention can be
employed to vaccinate animals against infection by
coronaviruses or at least to prevent the clinical symptoms
associated with such infections. Such vaccines will provide
protection against multiple coronaviruses and cross species
protection. Vaccines may be produced which are either
protein-based or nucleic acid-based. In both cases, the
vaccinated animal is exposed to an immunogenic polypeptide
which comprises a UCD. A protective immune response is
elicited which is sufficient to protect the animal against
coronavirus.

Vaccines according to the present invention can be
either:

a) compositions which comprise a polypeptide that
includes a universal conserved domain; or

b) compositions which comprise a nucleic acid
molecule that includes a nucleotide sequence which encodes
a polypeptide that includes a universal conserved domain.

In both types of vaccines, the polypeptide is not a complete S
protein and it elicits a protective immune response in
animals.

In protein based, i.e. subunit, vaccines,
polypeptides having a UCD may be produced using standard
techniques including recombinant DNA techniques for protein
production or by peptide synthesis. In preferred embodiments,
polypeptides used in subunit vaccines according to the present
invention are produced by recombinant DNA methodology.

The nucleic acid sequences of coronavirus S genes
are widely known. One having ordinary skill in the art may
routinely obtain DNA that encodes a polypeptide including a
UCD using standard techniques and widely available starting
materials. The nucleotide and amino acid sequences for S
proteins from several types and strains of coronaviruses can
be found in the co-owned published PCT application
PCT/US91/08525 which claims priority to U.S. Patent
Application Serial Numbers 613,066 and 698,927; each of these
applications is incorporated herein by reference. Nucleotide
and amino acid sequences of S proteins can also be found in
published European Patent Applications publication numbers:
0,524,672 A1; 0,411,684 A2; 0,264,979 A1; 0,138,242 A1; and
application number EP 91 30 3737. Each of these European
patent applications is incorporated herein by reference. In
addition, nucleotide and amino acid sequences of S proteins
from several coronaviruses as well as nucleotide and amino
acid sequences of a consensus sequence are disclosed in the
co-owned, co-pending patent application: which is filed on the
same day as the present application; which is entitled
"Compositions and Methods for Vaccinating Coronaviruses";
which names the same inventors as the present application
(Miller, Timothy J.; Jones, Elaine V.; Reed, Albert P.; and
Klepfer, Sharon R); which has been designated docket number
H85009-1 by Applicants; and which is incorporated herein by
reference.

Nucleic acid molecules encoding some or all of an
S protein from a coronavirus may be generated by a variety of
techniques. For such molecules, a nucleotide sequence that
encodes a UCD may be identified. Using, for example,
Polymerase Chain Reaction (PCR) methodology, primers flanking
both sides of the region of interest may be designed and used to
produce multiple copies of the UCD routinely. Alternatively,
using restriction enzymes, a UCD may be isolated from DNA
encoding an S protein. Moreover, nucleic acid molecules that
encode a UCD may also be synthesized using techniques well
known to those having ordinary skill in the art.

One having ordinary skill in the art can, using well
known techniques, insert such DNA molecules into a
commercially available expression vector for use in well known
expression systems. For example, the commercially available
plasmid pSE420 (Invitrogen, San Diego, CA) may be used for
production of a polypeptide including a UCD in
E. coli. The commercially available plasmid pYES2
(Invitrogen, San Diego, CA) may, for example, be used for
production in S. cerevisiae strains of yeast. The
commercially available MaxBac™ (Invitrogen, San Diego, CA)
complete baculovirus expression system may, for example, be
used for production in insect cells. The commercially
available plasmid pcDNA I (Invitrogen, San Diego, CA) may, for
example, be used for production in mammalian cells such as
Chinese Hamster Ovary cells. One having ordinary skill in the
art can use these commercial expression vectors and systems
or others to produce a polypeptide including a UCD using
routine techniques and readily available starting materials.
(See e.g., Sambrook et al., Molecular Cloning: A Laboratory
Manual, Second Ed. Cold Spring Harbor Press (1989) which is
incorporated herein by reference.) Thus, the desired proteins
can be prepared in both prokaryotic and eukaryotic systems,
resulting in a spectrum of processed forms of the protein.

The particulars for the construction of expression
systems suitable for desired hosts are known to those in the
art. Briefly, for recombinant production of the protein, the
DNA encoding the polypeptide is suitably ligated into the
expression vector of choice. The DNA is operably linked to
all regulatory elements which are necessary for expression of
the DNA in the selected host. One having ordinary skill in
the art can, using well known techniques, prepare expression
vectors for recombinant production of the polypeptide.

The expression vector including the DNA that encodes
the polypeptide comprising a UCD is used to transform the
compatible host which is then cultured and maintained under
conditions wherein expression of the foreign DNA takes place.
The protein of the present invention thus produced is
recovered from the culture, either by lysing the cells or from
the culture medium as appropriate and known to those in the
art. One having ordinary skill in the art can, using well
known techniques, isolate the polypeptide that includes a UCD
produced using such expression systems.

In addition to producing these proteins by
recombinant techniques, automated peptide synthesizers may
also be employed to produce polypeptides that include a UCD.
Such techniques are well known to those having ordinary skill
in the art and are useful if derivatives are to be produced
which have substitutions not provided for in DNA-encoded
protein production.

Subunit vaccines according to the invention comprise
a polypeptide that includes a UCD but which is not a complete
S protein and a pharmaceutically acceptable carrier or
diluent. Optionally, the vaccine may comprise additional
immunogenic proteins, additional vaccine components such as
non-subunit vaccines, and/or an adjuvant.

In nucleic acid molecule based, i.e. recombinant,
vaccines, a nucleotide sequence which encodes a polypeptide
that includes a UCD is inserted into a vector and administered
to the animal. The vector delivers genetic material to the
animal where it is transcribed and translated to produce the
immunogenic polypeptide. Vectors for use as vaccines are well
known and include non-pathogenic viruses and prokaryotic
organisms. Suitable vectors for delivering genetic material
are readily available or may be produced from readily
available starting materials using standard techniques. Two
examples of vectors useful for delivering genetic material as
a vaccine are the recombinant pox vectors and non-pathogenic
Salmonella strains. The nucleotide sequence that encodes the
immunogenic polypeptide is operably linked to regulatory
elements required for expression and inserted within the
vector. Alternatively, it is incorporated into the vector at
a site where it is placed under the control of the necessary
regulatory elements already present in the vector. Naked DNA
may also be used as a vaccine delivery system.

Recombinant vaccines may be used in combination with
other vaccines. Further, the genetic material which encodes
the polypeptide that comprises the UCD may further comprise
additional coding sequences which encode other peptide
sequences capable of eliciting an immunogenic response against
coronavirus or another pathogen.

Both subunit and recombinant vaccines may be
formulated following accepted convention using buffers,
stabilizers, preservatives, solubilizers and compositions used
to facilitate sustained release. Generally, additives for
isotonicity can include sodium chloride, dextrose, mannitol,
sorbitol and lactose. Stabilizers include gelatin and
albumin. Adjuvants such as aluminum or magnesium hydroxide
may be employed. Vaccines may be maintained in solution or,
in some cases, particularly recombinant vaccines, lyophilized.
Lyophilized vaccine may be stored conveniently and combined
with sterile solution before administration.

The amount of polypeptide administered depends upon 
such factors as the size of the polypeptide, the species, age, 
weight, and general physical characteristics of the animal, 
and the composition of the vaccine. Determination of 
optimum dosage for each parameter may be made by routine 
methods. Generally, subunit vaccines according to the present 
invention contain between 0.05-5000 micrograms of polypeptide 
per milliliter of sterile solution, preferably 10-1000 
micrograms. Generally, recombinant vaccines according to the 
present invention contain between 10^5-10^8 infectious units per 
milliliter of sterile solution. About 0.5-2 milliliters of 
polypeptide-containing solution is administered. 
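
By way of illustration only, the polypeptide dose per administration implied by the subunit vaccine ranges above may be estimated by multiplying the concentration by the administered volume. The following Python sketch assumes that the lowest concentration is paired with the smallest volume and the highest concentration with the largest volume; the function and constant names are hypothetical and form no part of the original disclosure.

```python
def dose_range_ug(conc_ug_per_ml, volume_ml):
    """Return the (lowest, highest) polypeptide dose in micrograms.

    Both arguments are (low, high) tuples. The lowest dose pairs the lowest
    concentration with the smallest volume, and the highest dose pairs the
    highest concentration with the largest volume (an assumption made here
    for illustration).
    """
    c_lo, c_hi = conc_ug_per_ml
    v_lo, v_hi = volume_ml
    return (c_lo * v_lo, c_hi * v_hi)


# Ranges quoted in the paragraph above.
VOLUME_ML = (0.5, 2.0)              # about 0.5-2 milliliters administered
GENERAL_UG_PER_ML = (0.05, 5000.0)  # general subunit vaccine concentration
PREFERRED_UG_PER_ML = (10.0, 1000.0)  # preferred subunit vaccine concentration

general = dose_range_ug(GENERAL_UG_PER_ML, VOLUME_ML)
preferred = dose_range_ug(PREFERRED_UG_PER_ML, VOLUME_ML)

print(f"general:   {general[0]:g}-{general[1]:g} micrograms per administration")
print(f"preferred: {preferred[0]:g}-{preferred[1]:g} micrograms per administration")
```

Such a calculation only bounds the dose; the optimum within those bounds remains a matter of routine determination as stated above.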
Subunit vaccines and genetic material based vaccines 
may be administered by an appropriate route such as, for 
example, by oral, intranasal, intramuscular, intraperitoneal 
or subcutaneous administration. In some embodiments, 
intranasal or subcutaneous administration is preferred. 
Subsequent to initial vaccination, animals may be boosted by 
revaccination. 

Examples 

Example 1 - Cloning of Coronavirus Conserved Region in pMG1 
The bacterial expression vector, pMG-1, allows a 
gene encoding a foreign protein to be fused to a partial 
sequence of the NS1 gene from influenza virus, namely the 
portion encoding the first 81 amino acids thereof. This vector 
is described in European Patent Application No. 366,238, 
published May 2, 1990, which is incorporated herein by reference. 

Primers were designed to amplify an S gene region 
encoding amino acids 1115-1238 of the DF2 FIPV strain for 
expression in this vector as follows. The upstream primer 
contains NcoI and NdeI restriction sites and initiates 
amplification at base pair 3406 (amino acid 1115), and is SEQ 
ID NO: 13: 

5'-GTTGTCAACACA CCATGG AT CATATG CAAGGGCAAGCTTTAAGTCACCTTACA 
                NcoI      NdeI 

The downstream primer contains a StuI site and terminates 
amplification at base pair 3777 (amino acid 1238), and is SEQ 
ID NO: 14: 

5'-AAATACCTG AGGCCT CCAAGCTGTTACAGTTTCATAAGCTGT 
             StuI 
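
As a purely illustrative check, the restriction sites named above can be located in the two primers, and the length of the coding span can be derived from the stated amino acid range. The Python sketch below takes the primer sequences from SEQ ID NO: 13 and SEQ ID NO: 14 of the sequence listing and uses the standard recognition sequences of NcoI (CCATGG), NdeI (CATATG) and StuI (AGGCCT); the helper names are hypothetical and are not part of the original disclosure.

```python
# Primer sequences as set out in SEQ ID NO: 13 and SEQ ID NO: 14,
# written in the listing's groups of ten and joined here.
UPSTREAM = "".join(
    "GTTGTCAACA CACCATGGAT CATATGCAAG GGCAAGCTTT AAGTCACCTT ACA".split()
)
DOWNSTREAM = "".join(
    "AAATACCTGA GGCCTCCAAG CTGTTACAGT TTCATAAGCT GT".split()
)

# Standard recognition sequences of the enzymes named in Example 1.
SITES = {"NcoI": "CCATGG", "NdeI": "CATATG", "StuI": "AGGCCT"}


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based index."""
    return {
        name: primer.find(site)
        for name, site in SITES.items()
        if site in primer
    }


def coding_span_bp(first_aa, last_aa):
    """Number of base pairs encoding an inclusive range of amino acids."""
    return (last_aa - first_aa + 1) * 3


print(len(UPSTREAM), find_sites(UPSTREAM))
print(len(DOWNSTREAM), find_sites(DOWNSTREAM))
print(coding_span_bp(1115, 1238), 3777 - 3406 + 1)
```

The coding span computed from amino acids 1115-1238 agrees with the span between base pairs 3406 and 3777; the additional 5' primer sequences carrying the restriction sites lengthen the amplified product beyond that span.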

The amplified fragment (412 bp) was cloned into the pT7Blue 
vector according to the manufacturer's instructions. A 
plasmid containing amino acids 1115-1238 in pT7Blue was 
digested with NcoI/StuI, the 412 base pair insert isolated, 
and ligated overnight at 15 °C to plasmid vector pMG1 digested 
with NcoI/StuI and dephosphorylated. Host cells AR120 and 
AR58 were transformed with the ligation mix and the presence 
of insert-bearing clones was confirmed by diagnostic 
restriction enzyme digestions. 

Example 2 - Cloning of Coronavirus Conserved Region in pSC11 
Vaccinia recombinants were engineered to contain the 
1115-1238 amino acid conserved region of WT DF2 FIPV. The 
conserved region was cloned into the vaccinia expression 
vector pSC11 by blunt-ending the 412 base pair NcoI/StuI 
fragment isolated from the pT7Blue clone described in Example 
1, end-filling by incubation with Klenow polymerase, and 
inserting it into the SmaI site downstream of the 7.5K 
vaccinia promoter. The ligation mix was transformed into 
HB101 host cells. Full-length clones were identified and 
oriented with respect to vector by BamHI and ScaI digests of 
mini-prep DNAs, respectively. 
Table 1 

           1                                                   50
Wsue2      NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Df2e2      NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Tse2       NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Fecve2     NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Tgeve2     NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Tgeve2f2   NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLQN
Bcve2      AIQEGFDATN S......... .....ALVKI QAVVNANAEA LNNLLQQLSN
Hcve2      NIVDAFTGVN DAITQTSQAL QTVATALNKI QDVVNQQGNS LNHLTSQLRQ
Ibbspi     HMQE...... ........GF RSTSLALQQI QDVVSKQSAI LTETMASLNK
Mhve2a59   AIQDGFDATN S......... .....ALGKI QSVVNANAEA LNNLLNQLSN
Mhvs       AIQEGFDATN S......... .....ALGKI QSVVNANAEA LNNLLNQLSN
CONSENSUS  NITQAFGKVN DAIHQTSQGL ATVAKALAKV QDVVNTQGQA LSHLTVQLGN

           51                                                 100
Wsue2      NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Df2e2      NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Tse2       NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Fecve2     NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Tgeve2     NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Tgeve2f2   NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV
Bcve2      RFGAISSSLQ EILSRLDALE AQAQIDRLIN GRLTALNVYV SQQLSDSTLV
Hcve2      NFQAISSSIQ AIYDRLDTIQ ADQQVDRLIT GRLAALNVFV SHTLTKYTEV
Ibbspi     NFGAISSVIQ EI.QQFDAIQ ANAQVDRLIT GRLSSLSVLA SAKQAE.IRV
Mhve2a59   RFGAISASLQ EILTRLEAVE AKAQIDRLIN GRLTALNAYI SKQLSDSTLI
Mhvs       RFGAISASLQ EILTRLDAVE AKAQIDRLIN GRLTALNAYI SKQLSDSTLI
CONSENSUS  NFQAISSSIS DIYNRLDELS ADAQVDRLIT GRLTALNAFV SQTLTRQAEV

           101                                                150
Wsue2      RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Df2e2      RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Tse2       RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Fecve2     RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Tgeve2     RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Tgeve2f2   RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL
Bcve2      KFSAAQAMEK VNECVKSQSS RINFCGNGNH IISLVQNAPY GLYFIHFSYV
Hcve2      RASRQLAQQK VNECVKSQSK RYGFCGNGTH IFSIVNAAPE GLVFLHTVLL
Ibbspi     SQQRELATQK INECVKSQSI RYSFCGNGRH VLTIPQNAPN GIVFIHFSYT
Mhve2a59   KVSAAQAIEK VNECVKSQTT RINFCGNGNH ILSLVQNAPY GLYFIHFSYV
Mhvs       KFSAAQAIEK VNECVKSQTT RINFCGNGNH ILSLVQNAPY GLCFIHFSYV
CONSENSUS  RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL

           151                                                200
Wsue2      PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Df2e2      PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Tse2       PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Fecve2     PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Tgeve2     PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Tgeve2f2   PTAYETVTAW SGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
Bcve2      PTKYVTAKYS PGLCIA.GDR GIA.....PK SGYFVNVNNT WMFTGSGYYY
Hcve2      PTQYKDVEAW SGLC...VDG TNGYVLRQPN LALYKE.GNY YRITSRIMFE
Ibbspi     PDSFVNVTAI VGFCVKPANA SQ.AIVPANG RGIFIQVNGS YYITARDMYM
Mhve2a59   PISFTTANVS PGLCIS.GDR GLA.....PK AGYFVQDDGE WKFTGSSYYY
Mhvs       PTSFKTANVS PGLCIS.GDR GLA.....PK AGYFVQDNGE WKFTGSNYYY
CONSENSUS  PTAYETVTAW PGICASDGDR TFGLVVKDVQ LTLFRNLDDK FYLTPRTMYQ
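
For illustration only, the degree of conservation shown in Table 1 can be expressed as a pairwise percent identity between two aligned rows. The Python sketch below counts identical residues over the columns in which neither row carries a gap (written "."); this particular measure, and the names used, are assumptions made here and are not taken from the original disclosure. The two example rows are positions 101-150 of the Df2e2 and Bcve2 sequences, a stretch of Table 1 that contains no gaps.

```python
GAP = "."


def percent_identity(row_a, row_b):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == GAP or b == GAP:
            continue  # skip columns that are gapped in either row
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Positions 101-150 of Table 1, written in groups of ten and joined here.
DF2E2 = "".join(
    "RASRQLAKDK VNECVRSQSQ RFGFCGNGTH LFSLANAAPN GMIFFHTVLL".split()
)
BCVE2 = "".join(
    "KFSAAQAMEK VNECVKSQSS RINFCGNGNH IISLVQNAPY GLYFIHFSYV".split()
)

print(f"Df2e2 vs Bcve2, positions 101-150: "
      f"{percent_identity(DF2E2, BCVE2):.1f}% identity")
```

Columns containing a gap in either row are excluded from both the numerator and the denominator, so an insertion or deletion does not by itself lower the reported identity.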
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Miller, Timothy J. 
    Jones, Elaine V. 
    Reed, Albert P. 
    Klepfer, Sharon R. 

(ii) TITLE OF INVENTION: Universal Coronavirus Vaccine 

(iii) NUMBER OF SEQUENCES: 14 

(iv) CORRESPONDENCE ADDRESS: 
    (A) ADDRESSEE: SmithKline Beecham Corporation 
    (B) STREET: 709 Swedeland Road 
    (C) CITY: King of Prussia 
    (D) STATE: PA 
    (E) COUNTRY: USA 
    (F) ZIP: 19406-2799 

(v) COMPUTER READABLE FORM: 
    (A) MEDIUM TYPE: Floppy disk 
    (B) COMPUTER: IBM PC compatible 
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS 
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 
    (A) APPLICATION NUMBER: 
    (B) FILING DATE: 
    (C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
    (A) APPLICATION NUMBER: US 07/882,171 
    (B) FILING DATE: 08-MAY-1992 

(vii) PRIOR APPLICATION DATA: 
    (A) APPLICATION NUMBER: US 07/698,927 
    (B) FILING DATE: 13-MAY-1991 

(vii) PRIOR APPLICATION DATA: 
    (A) APPLICATION NUMBER: US 07/613,066 
    (B) FILING DATE: 14-NOV-1990 

(viii) ATTORNEY/AGENT INFORMATION: 
    (A) NAME: Schreck, Patricia A. 
    (B) REGISTRATION NUMBER: 33,777 
    (C) REFERENCE/DOCKET NUMBER: SBC/PAS/WW001 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 200 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp
            20                  25                  30

Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu
        35                  40                  45

Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn
    50                  55                  60

Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg
                85                  90                  95

Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn
            100                 105                 110

Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe
    130                 135                 140

Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp
145                 150                 155                 160

Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val
                165                 170                 175

Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr
            180                 185                 190

Leu Thr Pro Arg Thr Met Tyr Gln
        195                 200

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 179 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser Ala Leu Val Lys Ile
1               5                   10                  15

Gln Ala Val Val Asn Ala Asn Ala Glu Ala Leu Asn Asn Leu Leu Gln
            20                  25                  30

Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser Ser Ser Leu Gln Glu Ile
        35                  40                  45

Leu Ser Arg Leu Asp Ala Leu Glu Ala Gln Ala Gln Ile Asp Arg Leu
    50                  55                  60

Ile Asn Gly Arg Leu Thr Ala Leu Asn Val Tyr Val Ser Gln Gln Leu
65                  70                  75                  80

Ser Asp Ser Thr Leu Val Lys Phe Ser Ala Ala Gln Ala Met Glu Lys
                85                  90                  95

Val Asn Glu Cys Val Lys Ser Gln Ser Ser Arg Ile Asn Phe Gly Asn
            100                 105                 110

Gly Asn His Ile Ile Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Tyr
        115                 120                 125

Phe Ile His Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Tyr
    130                 135                 140

Ser Pro Gly Leu Cys Ile Ala Gly Asp Arg Gly Ile Ala Pro Lys Ser
145                 150                 155                 160

Gly Tyr Phe Val Asn Val Asn Asn Thr Trp Met Phe Thr Gly Ser Gly
                165                 170                 175

Tyr Tyr Tyr

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 196 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Asn Ile Val Asp Ala Phe Thr Gly Val Asn Asp Ala Ile Thr Gln Thr
1               5                   10                  15

Ser Gln Ala Leu Gln Thr Val Ala Thr Ala Leu Asn Lys Ile Gln Asp
            20                  25                  30

Val Val Asn Gln Gln Gly Asn Ser Leu Asn His Leu Thr Ser Gln Leu
        35                  40                  45

Arg Gln Asn Phe Gln Ala Ile Ser Ser Ser Ile Gln Ala Ile Tyr Asp
    50                  55                  60

Arg Leu Asp Thr Ile Gln Ala Asp Gln Gln Val Asp Arg Leu Ile Thr
65                  70                  75                  80

Gly Arg Leu Ala Ala Leu Asn Val Phe Val Ser His Thr Leu Thr Lys
                85                  90                  95

Tyr Thr Glu Val Arg Ala Ser Arg Gln Leu Ala Gln Gln Lys Val Asn
            100                 105                 110

Glu Cys Val Lys Ser Gln Ser Lys Arg Tyr Gly Phe Cys Gly Asn Gly
        115                 120                 125

Thr His Ile Phe Ser Ile Val Asn Ala Ala Pro Glu Gly Leu Val Phe
    130                 135                 140

Leu His Thr Val Leu Leu Pro Thr Gln Tyr Lys Asp Val Glu Ala Trp
145                 150                 155                 160

Ser Gly Leu Cys Val Asp Gly Thr Asn Gly Tyr Val Leu Arg Gln Pro
                165                 170                 175

Asn Leu Ala Leu Tyr Lys Glu Gly Asn Tyr Tyr Arg Ile Thr Ser Arg
            180                 185                 190

Ile Met Phe Glu
        195



(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 183 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

His Met Gln Glu Gly Phe Arg Ser Thr Ser Leu Ala Leu Gln Gln Ile
1               5                   10                  15

Gln Asp Val Val Ser Lys Gln Ser Ala Ile Leu Thr Glu Thr Met Ala
            20                  25                  30

Ser Leu Asn Lys Asn Phe Gly Ala Ile Ser Ser Val Ile Gln Glu Ile
        35                  40                  45

Gln Gln Phe Asp Ala Ile Gln Ala Asn Ala Gln Val Asp Arg Leu Ile
    50                  55                  60

Thr Gly Arg Leu Ser Ser Leu Ser Val Leu Ala Ser Ala Lys Gln Ala
65                  70                  75                  80

Glu Ile Arg Val Ser Gln Gln Arg Glu Leu Ala Thr Gln Lys Ile Asn
                85                  90                  95

Glu Cys Val Lys Ser Gln Ser Ile Arg Tyr Ser Phe Cys Gly Asn Gly
            100                 105                 110

Arg His Val Leu Thr Ile Pro Gln Asn Ala Pro Asn Gly Ile Val Phe
        115                 120                 125

Ile His Phe Ser Tyr Thr Pro Asp Ser Phe Val Asn Val Thr Ala Ile
    130                 135                 140

Val Gly Phe Cys Val Lys Pro Ala Asn Ala Ser Gln Ala Ile Val Pro
145                 150                 155                 160

Ala Asn Gly Arg Gly Ile Phe Ile Gln Val Asn Gly Ser Tyr Tyr Ile
                165                 170                 175

Thr Ala Arg Asp Met Tyr Met
            180

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 180 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Ala Ile Gln Asp Gly Phe Asp Ala Thr Asn Ser Ala Leu Gly Lys Ile
1               5                   10                  15

Gln Ser Val Val Asn Ala Asn Ala Glu Ala Leu Asn Asn Leu Leu Asn
            20                  25                  30

Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser Ala Ser Leu Gln Glu Ile
        35                  40                  45

Leu Thr Arg Leu Glu Ala Val Glu Ala Lys Ala Gln Ile Asp Arg Leu
    50                  55                  60

Ile Asn Gly Arg Leu Thr Ala Leu Asn Ala Tyr Ile Ser Lys Gln Leu
65                  70                  75                  80

Ser Asp Ser Thr Leu Ile Lys Val Ser Ala Ala Gln Ala Ile Glu Lys
                85                  90                  95

Val Asn Glu Cys Val Lys Ser Gln Thr Thr Arg Ile Asn Phe Cys Gly
            100                 105                 110

Asn Gly Asn His Ile Leu Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu
        115                 120                 125

Tyr Phe Ile His Phe Ser Tyr Val Pro Ile Ser Phe Thr Thr Ala Asn
    130                 135                 140

Val Ser Pro Gly Leu Cys Ile Ser Gly Asp Arg Gly Leu Ala Pro Lys
145                 150                 155                 160

Ala Gly Tyr Phe Val Gln Asp Asp Gly Glu Trp Lys Phe Thr Gly Ser
                165                 170                 175

Ser Tyr Tyr Tyr
            180

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 180 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser Ala Leu Gly Lys Ile
1               5                   10                  15

Gln Ser Val Val Asn Ala Asn Ala Glu Ala Leu Asn Asn Leu Leu Asn
            20                  25                  30

Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser Ala Ser Leu Gln Glu Ile
        35                  40                  45

Leu Thr Arg Leu Asp Ala Val Glu Ala Lys Ala Gln Ile Asp Arg Leu
    50                  55                  60

Ile Asn Gly Arg Leu Thr Ala Leu Asn Ala Tyr Ile Ser Lys Gln Leu
65                  70                  75                  80

Ser Asp Ser Thr Leu Ile Lys Phe Ser Ala Ala Gln Ala Ile Glu Lys
                85                  90                  95

Val Asn Glu Cys Val Lys Ser Gln Thr Thr Arg Ile Asn Phe Cys Gly
            100                 105                 110

Asn Gly Asn His Ile Leu Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu
        115                 120                 125

Cys Phe Ile His Phe Ser Tyr Val Pro Thr Ser Phe Lys Thr Ala Asn
    130                 135                 140

Val Ser Pro Gly Leu Cys Ile Ser Gly Asp Arg Gly Leu Ala Pro Lys
145                 150                 155                 160

Ala Gly Tyr Phe Val Gln Asp Asn Gly Glu Trp Lys Phe Thr Gly Ser
                165                 170                 175

Asn Tyr Tyr Tyr
            180

(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 199 amino acids 
    (B) TYPE: amino acid 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala Ile His Gln Thr
1               5                   10                  15

Ser Gly Leu Ala Thr Val Ala Lys Ala Leu Ala Lys Val Gln Asp Val
            20                  25                  30

Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu Gly
        35                  40                  45

Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn Arg
    50                  55                  60

Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr Gly
65                  70                  75                  80

Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg Gln
                85                  90                  95

Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn Glu
            100                 105                 110

Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly Thr
        115                 120                 125

His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe Phe
    130                 135                 140

His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp Pro
145                 150                 155                 160

Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe Gly Leu Val Val Lys
                165                 170                 175

Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp Asp Lys Phe Tyr Leu
            180                 185                 190

Thr Pro Arg Thr Met Tyr Gln
        195

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 53 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: single 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

GTTGTCAACA CACCATGGAT CATATGCAAG GGCAAGCTTT AAGTCACCTT ACA 

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
    (A) LENGTH: 42 base pairs 
    (B) TYPE: nucleic acid 
    (C) STRANDEDNESS: single 
    (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

AAATACCTGA GGCCTCCAAG CTGTTACAGT TTCATAAGCT GT 

Claims 

1. A polypeptide comprising a universal conserved 
domain of a coronavirus or an immunogenic fragment or 
derivative thereof; said polypeptide having less than a 
complete amino acid sequence of said S protein. 

2. A vaccine comprising a pharmaceutically acceptable 
carrier or diluent and a polypeptide comprising a universal 
conserved domain of a coronavirus or an immunogenic fragment 
or derivative thereof; said polypeptide having less than a 
complete amino acid sequence of said S protein. 

3. A nucleic acid molecule comprising a nucleotide 
sequence that encodes a polypeptide comprising a universal 
conserved domain of a coronavirus or an immunogenic fragment 
or derivative thereof; said polypeptide having less than a 
complete amino acid sequence of said S protein. 

4. A recombinant vaccine comprising a nucleic acid 
molecule, said nucleic acid molecule comprising a nucleotide 
sequence that encodes a polypeptide comprising a universal 
conserved domain of a coronavirus or an immunogenic fragment 
or derivative thereof; said polypeptide having less than a 
complete amino acid sequence of said S protein. 

5. A method of protecting an animal against coronavirus 
comprising administering a polypeptide comprising a universal 
conserved domain of a coronavirus or an immunogenic fragment 
or derivative thereof; said polypeptide having less than a 
complete amino acid sequence of said S protein. 

6. A method of protecting an animal against coronavirus 
comprising administering a nucleic acid molecule comprising 
a nucleotide sequence that encodes a polypeptide comprising 
a universal conserved domain of a coronavirus or an 
immunogenic fragment or derivative thereof; said polypeptide 
having less than a complete amino acid sequence of said S 
protein. 